Regression on Distributed Databases via Secure Multi-Party Computation
نویسندگان
چکیده
We present a method for performing linear regression on the union of distributed databases that does not entail constructing an integrated database, and therefore preserves confidentiality of the individual databases. The method can be used by statistical agencies to share information from their individual databases, or to make such information available to others.
منابع مشابه
Secure analysis of distributed chemical databases without data integration
We present a method for performing statistically valid linear regressions on the union of distributed chemical databases that preserves confidentiality of those databases. The method employs secure multi-party computation to share local sufficient statistics necessary to compute least squares estimators of regression coefficients, error variances and other quantities of interest. We illustrate ...
متن کاملSecure Regression on Distributed Databases
This article presents several methods for performing linear regression on the union of distributed databases that preserve, to varying degrees, confidentiality of those databases. Such methods can be used by federal or state statistical agencies to share information from their individual databases, or to make such information available to others. Secure data integration, which provides the lowe...
متن کاملUnconditionally Secure Computation on Large Distributed Databases with Vanishing Cost
Consider a network of k parties, each holding a long sequence of n entries (a database), with minimum vertex-cut greater than t. We show that any empirical statistic across the network of databases can be computed by each party with perfect privacy, against any set of t ¡ k=2 passively colluding parties, such that the worst-case distortion and communication cost (in bits per database entry) bot...
متن کاملAnalysis of Integrated Data without Data Integration
M scientific and policy investigations require statistical analyses that “integrate” data stored in multiple, distributed databases. For example, a regression analysis on integrated state databases about factors influencing student performance would be more insightful than individual analyses, or complementary to them. Other contexts where the same need arises range from homeland security to en...
متن کاملSecure, Privacy-Preserving Analysis of Distributed Databases
There is clear value, in both industrial and government settings, derived from performing statistical analyses that, in effect, integrate data in multiple, distributed databases. However, the barriers to actually integrating the data can be substantial or even insurmountable. Corporations may be unwilling to share proprietary databases such as chemical databases held by pharmaceutical manufactu...
متن کامل